Skip to content

MySQL 9: Add support for VECTOR type #15431

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 20, 2024

Conversation

kamil-tekiela
Copy link
Member

@kamil-tekiela kamil-tekiela commented Aug 15, 2024

This still doesn't fix the heap crash that happens without this change.

Fixes #15432

@kamil-tekiela
Copy link
Member Author

I am not sure if this should go into PHP 8.2 or not.

@cmb69
Copy link
Member

cmb69 commented Aug 16, 2024

I am not sure if this should go into PHP 8.2 or not.

Fixing the heap corruption should target PHP-8.2 (unless it would be too disruptive a change). Proper support for new MySQL datatypes should probably target master. As I understand it, this PR is supposed to return the data as string; if you could add a respective test case, that would clarify.

@kamil-tekiela
Copy link
Member Author

I don't know what's causing the heap corruption or how to fix it. This PR only adds support for the new data type hence why I targeted master. A test would be nice, but I think it can only be tested with a MySQL 9 server and our test suite doesn't offer that yet.

@cmb69
Copy link
Member

cmb69 commented Aug 16, 2024

I don't know what's causing the heap corruption or how to fix it.

Me neither (might have something to do with mysql_nd's arena allocation).

A test would be nice, but I think it can only be tested with a MySQL 9 server and our test suite doesn't offer that yet.

Even if the test would be skipped in CI, it would nonetheless be useful for local developement, and (for me) to see what's going on (i.e. if a string is returned, what the format would look like).

@kamil-tekiela kamil-tekiela marked this pull request as draft August 16, 2024 11:24
@kamil-tekiela kamil-tekiela marked this pull request as ready for review August 16, 2024 17:22
@cmb69
Copy link
Member

cmb69 commented Aug 16, 2024

Thanks for adding the test. So MySQL sends the data as 4-byte binary floats in little-endian format accross the wire (not in network byte order)? If so, users would need to unpack("g*", $data) – that's okay for me (assuming that's sufficiently portable).

@kamil-tekiela
Copy link
Member Author

I don't actually know what users would do with this in PHP. I assume they wouldn't usually do this and instead use VECTOR_TO_STRING and then json_decode. I guess they could unpack it too if they really want. The point is that it sends the data the same way as BLOB fields do.

@cmb69
Copy link
Member

cmb69 commented Aug 16, 2024

I don't actually know what users would do with this in PHP. I assume they wouldn't usually do this and instead use VECTOR_TO_STRING and then json_decode.

It was just because @richloh wrote:

A php data variable that holds the results, preferrably an array of floats with an array length that matches the vector length.
The native mysql client also has this type of output (as of v9).

So giving back an array of doubles (aka. "full support") would be nice for best compatibility with libmysql-client, but I'm fine with having only "basic support", i.e. passing back a binary string, but it might be a good idea to document this in UPGRADING (and to mention unpack("g*") there). Considering that pdo_mysql would behave differently depending on mysqlnd vs. libmysql-client in this case, seems to be harder to document. Since you're doing most of the work of implementing and documenting, I think you should decide. :) But maybe we get further input from others; after all, there is no need to hurry; there are still supposedly more than 5 weeks to RC1.

@kamil-tekiela
Copy link
Member Author

Considering that pdo_mysql would behave differently depending on mysqlnd vs. libmysql-client in this case, seems to be harder to document.

I don't think they would. Correct me if I'm wrong but I thought libmysql would work the exact same way. And in terms of the MySQL client, it returns a hexadecimal formatted string by default. I think this is the full support.

The bug report started in phpMyAdmin, which I believe is the reason for the expectation. PMA can implement such helper functionality once PHP supports this data type. But PHP should stick as close as possible to what libmysql does.

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I think this is good as is.

Note that the CI failures are unrelated to this PR, and could be resolve by rebasing again; not sure if that's important, though.

@kamil-tekiela kamil-tekiela merged commit a1ab846 into php:master Aug 20, 2024
8 of 10 checks passed
@kamil-tekiela kamil-tekiela deleted the VECTOR-type branch August 20, 2024 15:35
@RRosalia
Copy link

Missing PDO support here

@kamil-tekiela
Copy link
Member Author

Missing PDO support here

Can you expand on that? What is missing in PDO?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PHP crashes when processing a MySQL DB query with a new Vector format introduced in MySQL 9.0
3 participants